Algorithmic Statistics: Forty Years Later

Authors

  • Nikolai K. Vereshchagin
  • Alexander Shen
Abstract

Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. After this notion of a “good model” is introduced, a natural question arises: is it possible that for some piece of data there is no good model? If yes, how often do such bad (non-stochastic) data appear “in real life”? Another, more technical motivation comes from algorithmic information theory. In this theory a notion of complexity of a finite object (the amount of information in this object) is introduced; it assigns to every object some number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistics provides a more fine-grained classification: for each finite object some curve is defined that characterizes its behavior. It turns out that several different definitions give (approximately) the same curve.[1] In this survey we try to provide an exposition of the main results in the field (including full proofs for the most important ones), as well as some historical comments. We assume that the reader is familiar with the main notions of algorithmic information (Kolmogorov complexity) theory. An exposition can be found in [44, chapters 1, 3, 4] or [22, chapters 2, 3]; see also the survey [37]. A short survey of the main results of algorithmic statistics was given in [43] (without proofs); see also the last chapter of the book [44].

[1] Road map: Section 2 considers the notion of (α, β)-stochasticity; Section 3 considers two-part descriptions and the so-called “minimal description length principle”; Section 4 gives one more approach: we consider the list of objects of bounded complexity and measure how far some object is from the end of the list, getting some natural class of “standard descriptions” as a by-product; finally, Section 5 establishes a connection between these notions and resource-bounded complexity. The rest of the paper deals with attempts to bring the theory closer to practice by considering restricted classes of descriptions (Section 6) and strong models (Section 7).

∗ The work was funded in part by RFBR research project grant 16-01-00362-a (N.V.) and by the RaCAF ANR-15-CE40-0016-01 grant (A.S.).
† N. Vereshchagin is with Moscow State University, National Research University Higher School of Economics, and Yandex.
‡ A. Shen is with LIRMM, CNRS & Univ. Montpellier, 161 rue Ada, 34095, France.

arXiv:1607.08077v3 [cs.CC] 7 Mar 2017
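For orientation, the central notions named in the abstract and road map can be written down concretely. The LaTeX sketch below states them in one common form found in the algorithmic-statistics literature; the use of plain complexity C and the neglect of additive logarithmic terms are assumptions made here, and the survey itself may use slightly different conventions.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% C(.) denotes Kolmogorov complexity; all relations are meant up to additive O(log) terms.
% Randomness deficiency of a string x in a finite set (model) A containing x:
\[ d(x \mid A) \;=\; \log_2 |A| \;-\; C(x \mid A). \]
% (alpha, beta)-stochasticity (Section 2): x has a simple model in which it looks random.
\[ x \text{ is } (\alpha,\beta)\text{-stochastic} \;\Longleftrightarrow\; \exists\, A \ni x:\ C(A) \le \alpha \ \text{and}\ d(x \mid A) \le \beta. \]
% Two-part descriptions (Section 3): the profile (structure function) of x, i.e., the
% "curve" mentioned in the abstract, as a function of the allowed model complexity alpha.
\[ h_x(\alpha) \;=\; \min \{\, \log_2 |A| \;:\; x \in A,\ C(A) \le \alpha \,\}. \]
\end{document}

In these terms, the “non-stochastic” objects mentioned in the abstract are those x for which no set A with C(A) ≤ α achieves d(x | A) ≤ β for reasonably small α and β.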


Similar articles

Forty Years of X-Ray Binaries

In 2012 it was forty years since the discovery of the first X-ray binary, Centaurus X-3, became known. That same year it was discovered that, apart from the High-Mass X-ray Binaries (HMXBs), there are also Low-Mass X-ray Binaries (LMXBs), and that Cygnus X-1 is most probably a black hole. By 1975 the new class of Be/X-ray binaries had also been discovered. After this it took 28 years before ESA's INTE...


Parallel Genetic Algorithm Using Algorithmic Skeleton

Algorithmic skeletons have received attention in recent years as an efficient method of parallel programming. Using this method, the programmer can implement parallel programs easily. In this study, a set of efficient algorithmic skeletons is introduced for use in implementing a parallel genetic algorithm (PGA). A performance model is derived for each skeleton that makes the comparison of skeletons po...


SVM Classifiers – Concepts and Applications to Character Recognition

Since the introduction of the concepts by Vladimir Vapnik, a large and increasing number of researchers have worked on the algorithmic and theoretical analysis of SVMs, merging concepts from disciplines as distant as statistics, functional analysis, optimization, and machine learning. The soft margin classifier was introduced a few years later by Cortes and Vapnik [1], and in 1995 the algorithm was e...



REDUCE: The First Forty Years

Volker Weispfenning and his group in Passau have had a long interest in computer algebra. In this talk, REDUCE, one of the algebra programs they use, will be discussed. The development of this program began over forty years ago. We shall discuss the design decisions that have influenced its long-term survival, and the way in which the program has evolved with time.



Journal: —

Volume: — Issue: —

Pages: —

Publication date: 2017